For long memory time series models with uncorrelated but dependent errors, we
establish the asymptotic normality of the Whittle estimator under mild conditions.
Our framework includes the widely used FARIMA models with GARCH-type
innovations. To cover nonstationary fractionally integrated processes, we extend
the idea of Abadir, Distaso and Giraitis (2007, Journal of Econometrics 141,
1353-1384) and develop the nonstationarity-extended Whittle estimation. The
resulting estimator is shown to be asymptotically normal and is more efficient than
the tapered Whittle estimator. Finally, the results from a small simulation study
are presented to corroborate our theoretical findings.

1 Introduction

In the past two decades, there has been a great deal of research on long memory
time series [see Doukhan et al. (2003), Robinson (2003)]. To model the long
memory phenomenon, a widely used model is the FARIMA(p, d, q) (fractional
autoregressive integrated moving average) model described as follows:

$$\phi(B)(1-B)^{d_X}(X_t-\mu) = \psi(B)u_t, \qquad (1)$$

where $\mu$ is the mean, $d_X \in (-1/2,1/2)$ is the long memory parameter, $B$ is
the backward shift operator and $\phi(B) = 1 - \sum_{i=1}^{p}\phi_i B^i$,
$\psi(B) = 1 + \sum_{i=1}^{q}\psi_i B^i$ are the AR (autoregressive) and MA (moving
average) polynomials respectively. We call the process $\{X_t\}$ fractionally
integrated with order $d_X$, denoted as $X_t \sim I(d_X)$. Typically
$\{u_t\}_{t\in\mathbb{Z}}$ are assumed to be independent and identically
distributed (iid) random variables. In the modeling of financial time series,
conditional heteroscedasticity is often found, so there is a surge of interest in the
modeling literature [see Baillie et al. (1996), Hauser and Kunst (1998a,b), Lien
and Tse (1999), Elek and Markus (2004), Koopman et al. (2007)] to extend (1) into
the so-called FARIMA-GARCH model. Specifically, for a regular GARCH(r, s) model
[cf. Bollerslev (1986)], we have

$$u_t = \varepsilon_t\sigma_t, \quad \sigma_t^2 = \alpha_0 + \sum_{i=1}^{r}\alpha_i u_{t-i}^2 + \sum_{j=1}^{s}\beta_j\sigma_{t-j}^2, \qquad (2)$$

where $\{\varepsilon_t\}$ are iid random variables with zero mean and unit variance.
Given a realization $\{X_1,\cdots,X_n\}$ from (1) with $u_t$ generated by (2), the
joint estimation of the parameter vectors involved in both FARIMA and GARCH models
has been investigated by Ling and Li (1997). In practice, one needs to specify the
orders of FARIMA and GARCH models before doing the joint estimation. Hence it is
customary to estimate the FARIMA model (1) first, then fit a GARCH model to the
residuals with the orders selected at each stage of model fitting. It is therefore
important to reassess the applicability of the existing estimator of the parameter
vector in the FARIMA model when $u_t$ is subject to unknown conditional
heteroscedasticity.

In this article, we treat the dependence (including conditional heteroscedasticity)
in $\{u_t\}$ nonparametrically. Specifically, we assume that
$\{u_t\}_{t\in\mathbb{Z}}$ is an uncorrelated mean-zero stationary process and
admits the following representation:

$$u_t = F(\cdots,\varepsilon_{t-1},\varepsilon_t), \qquad (3)$$

where $\{\varepsilon_t\}$ are iid random variables and $F$ is a measurable function
for which $u_t$ is a well defined random variable. The framework (3) is very general
and it includes the linear process $u_t = \sum_{i=0}^{\infty} b_i\varepsilon_{t-i}$
as a special case. It also includes various nonlinear time series models, such as
bilinear models [Subba Rao and Gabr (1984), Giraitis and Surgailis (2002)],
threshold autoregressive models [Tong (1990)], exponential GARCH [Nelson (1991)]
and asymmetric GARCH models [Ding et al. (1993)]. One of the major goals of this
paper is to study the asymptotic properties of the Whittle estimator of the
parameter vector involved in (1) when $u_t$ follows (3).

The framework (1) can be easily extended to allow nonstationarity. Let

$$(1-B)^{m}Y_t = X_t, \quad t = 1-m, 2-m, \cdots,$$

where $m \ge 0$ is the number of times $Y_t$ needs to be differenced to achieve
stationarity. According to Definition 1.1 of Abadir et al. (2007), $Y_t \sim I(d)$,
where $d = d_X + m$. Alternatively, $Y_t$ is called an $I(d)$ process of type I.
Another type of fractionally integrated process, called the type II process,
differs from its type I counterpart in terms of presample treatment. See Marinucci
and Robinson (1999), Robinson (2005) and Shimotsu and Phillips (2006) for detailed
discussions of their differences. Estimation of nonstationary FARIMA processes
under parametric assumptions has been investigated by a few researchers; see Beran
(1995), Velasco and Robinson (2000) and Mayoral (2007) among others. All the work
mentioned above imposed either conditionally homoscedastic martingale difference
or stronger iid assumptions on $u_t$. Since the Whittle estimator is not consistent
when $d > 1$ (see Theorem 3.1), Velasco and Robinson (2000) proposed the tapered
Whittle estimator and proved its consistency and asymptotic normality. Tapering
has been frequently used in the inference of fractionally integrated time series
and it has the nice property of annihilating the nonstationarity. However, tapering
inevitably inflates the variance of the estimator and therefore results in a loss
of efficiency. Recently, in the context of local Whittle estimation, Abadir et al.
(2007) developed the extended Fourier transform and periodogram to handle the
nonstationarity. Here, we generalize their idea to Whittle estimation and propose
the nonstationarity-extended Whittle estimator, which is shown to be consistent
and asymptotically normal with higher efficiency than the tapered Whittle
estimator.

The following notation will be used throughout the paper. For a column vector
$x = (x_1,\cdots,x_q)' \in \mathbb{R}^q$, let $|x| = (\sum_{j=1}^{q} x_j^2)^{1/2}$.
Let $\xi$ be a random vector. Write $\xi \in \mathcal{L}^p$ ($p > 0$) if
$\|\xi\|_p := [E(|\xi|^p)]^{1/p} < \infty$ and let $\|\cdot\| = \|\cdot\|_2$. For
$\xi \in \mathcal{L}^1$ define projection operators
$\mathcal{P}_k\xi = E(\xi|\mathcal{F}_k) - E(\xi|\mathcal{F}_{k-1})$,
$\mathcal{F}_k = (\cdots,\varepsilon_{k-1},\varepsilon_k)$. Let $C > 0$,
$C_j > 0$, $j = 1,2,\cdots$ denote generic constants which may vary from line to
line. Denote by $\to_D$ and $\to_p$ convergence in distribution and in probability,
respectively. The symbols $O_p(1)$ and $o_p(1)$ signify being bounded in
probability and convergence to zero in probability respectively. Let
$N(\mu,\Sigma)$ be a normal distribution with mean $\mu$ and covariance matrix
$\Sigma$. Denote by $\lfloor a\rfloor$ the integer part of $a$.

The paper is organized as follows. In Section 2, we state technical assumptions
and derive asymptotic distributional theory for the Whittle estimator in the
stationary case. Section 3 proves the inconsistency of the Whittle estimator in a
certain nonstationary region, introduces the nonstationarity-extended Whittle
estimator and discusses its asymptotic properties. In Section 4, we present Monte
Carlo simulation results for the Whittle estimator, the tapered Whittle estimator
and the nonstationarity-extended Whittle estimator. Finally, conclusions are
drawn in Section 5 and technical details are relegated to the Appendix.
2 Whittle Estimator (m = 0)



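Throughout, we consider the following framework, which is more general than (1).

$$X_t = \sum_{j=0}^{\infty} a_j(\theta)u_{t-j}, \quad \sum_{j=0}^{\infty} a_j^2(\theta) < \infty, \quad a_0(\theta) = 1. \qquad (4)$$

Let $i = \sqrt{-1}$ be the imaginary unit. For a complex number $c$, let $\bar{c}$
be its conjugate. For a process $\{Z_t\}_{t\in\mathbb{Z}}$, define the periodogram

$$I_Z(\lambda) = |w_Z(\lambda)|^2, \quad \text{where } w_Z(\lambda) = w_{Z,n}(\lambda) = \frac{1}{\sqrt{2\pi n}}\sum_{t=1}^{n} Z_t e^{it\lambda}.$$

Let $\lambda_j = 2\pi j/n$, $j = 0,1,\cdots,n-1$, be the Fourier frequencies. Let
$A(\lambda;\theta) = \sum_{j=0}^{\infty} a_j(\theta)e^{ij\lambda}$ be the transfer
function and denote by $A(\lambda) = A(\lambda;\theta_0)$, where $\theta_0$ is the
true value of $\theta$. Denote by $G(\lambda;\theta) = |A(\lambda;\theta)|^2$. Then
the spectral density function of $X_t$ is
$f_X(\lambda;\theta) = G(\lambda;\theta)\sigma^2/(2\pi)$, where
$\sigma^2 = \mathrm{var}(u_t)$. Denote by $f_X(\lambda) = f_X(\lambda;\theta_0)$.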

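The Whittle estimator $\hat\theta_n$ is defined as

$$\hat\theta_n = \mathrm{argmin}_{\theta\in\Theta}\, Q_n(\theta), \quad Q_n(\theta) = \frac{2\pi}{n}\sum_{j=1}^{n-1}\frac{I_X(\lambda_j)}{G(\lambda_j;\theta)}, \qquad (5)$$

where $\Theta$ is compact. Further we estimate $\sigma^2$ by
$\hat\sigma^2 = Q_n(\hat\theta_n)$. Note that the zero frequency is excluded in
$Q_n(\theta)$ for the purpose of mean correction.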
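As a minimal illustration of (5), the following Python sketch evaluates the periodogram at the nonzero Fourier frequencies and minimizes $Q_n$ over a grid for the pure fractional noise case $G(\lambda;d) = |1-e^{i\lambda}|^{-2d}$. The function names, the grid search and the restriction to a single parameter are assumptions made here for illustration only; a general FARIMA fit would need a numerical optimizer over the full parameter vector.

```python
import numpy as np

def periodogram(z):
    """I_Z(lambda_j) at lambda_j = 2*pi*j/n for j = 1, ..., n-1 (zero frequency dropped)."""
    n = len(z)
    # |FFT|^2 equals |sum_t z_t e^{i t lambda_j}|^2 at Fourier frequencies:
    # the sign and the index origin of the exponent do not change the modulus.
    dft = np.fft.fft(z)
    return np.abs(dft[1:]) ** 2 / (2.0 * np.pi * n)

def whittle_objective(d, I, lam):
    """Q_n(d) = (2*pi/n) * sum_j I(lambda_j) / G(lambda_j; d), G = |1 - e^{i lambda}|^{-2d}."""
    n = len(I) + 1
    G = np.abs(1.0 - np.exp(1j * lam)) ** (-2.0 * d)
    return 2.0 * np.pi / n * np.sum(I / G)

def whittle_fit(x, grid=None):
    """Grid-search minimizer of Q_n; returns (d_hat, sigma2_hat = Q_n(d_hat))."""
    n = len(x)
    lam = 2.0 * np.pi * np.arange(1, n) / n
    I = periodogram(x)
    if grid is None:
        grid = np.linspace(-0.49, 0.49, 197)   # illustrative parameter space for d
    values = np.array([whittle_objective(d, I, lam) for d in grid])
    k = int(np.argmin(values))
    return grid[k], values[k]
```

For example, `whittle_fit(np.random.default_rng(0).standard_normal(2048))` should return an estimate of $d$ close to zero and a variance estimate close to one, since white noise corresponds to $d = 0$ and $\sigma^2 = 1$.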

Throughout, assume that $\theta_0$ lies in the interior of $\Theta$. In particular,
$\theta_0^{(1)} = d_{X0}$ is an interior point of $\Theta^{(1)} = [a_1,a_2]$ with
$-1/2 < a_1 < a_2 < \infty$. Hereafter we use $\theta^{(1)}$ and $\theta^{(-1)}$ to
denote the first element and the remaining elements of a vector $\theta$
respectively; $\Theta^{(1)}$ and $\Theta^{(-1)}$ denote the sets for the first
element and remaining elements respectively.

To establish the consistency and asymptotic normality of $\hat\theta_n$, we make
the following assumptions.

Assumption 2.1. Assume $f_X(\lambda) \sim |\lambda|^{-2d_{X0}}G$ as
$\lambda \to 0$, where $d_{X0} \in (-1/2,1/2)$ and $G \in (0,\infty)$. Further we
assume that

$$|\partial A(\lambda)/\partial\lambda| \le C|A(\lambda)||\lambda|^{-1}, \quad \lambda \in (0,\pi].$$

Assumption 2.2. $\sum_{k_1,k_2,k_3\in\mathbb{Z}} |\mathrm{cum}(u_0,u_{k_1},u_{k_2},u_{k_3})| < \infty$.
Remark 2.1. Summability conditions on joint cumulants are widely adopted in
spectral analysis. For a linear process
$u_t = \sum_{j\in\mathbb{Z}} b_j\varepsilon_{t-j}$ with $\varepsilon_j$ being iid,
Assumption 2.2 holds if $\sum_{j\in\mathbb{Z}} |b_j| < \infty$ and
$\varepsilon_1 \in \mathcal{L}^4$. For nonlinear processes $u_t$, it is satisfied
under a geometric moment contraction (GMC) condition with order 4 [see
Proposition 2 of Wu and Shao (2004)]. The process $\{u_t\}$ is GMC with order
$\alpha$, $\alpha > 0$, if there exists a $\rho = \rho(\alpha) \in (0,1)$ such that

$$E(|u_n - u_n'|^{\alpha}) \le C\rho^{n}, \quad n \in \mathbb{N}, \qquad (6)$$

where $u_n' = F(\cdots,\varepsilon_{-1}',\varepsilon_0',\varepsilon_1,\cdots,\varepsilon_n)$
and $\{\varepsilon_t'\}_{t\in\mathbb{Z}}$ is an iid copy of
$\{\varepsilon_t\}_{t\in\mathbb{Z}}$. The property (6) indicates that the process
$\{u_n\}$ forgets its past exponentially fast and it can be verified for many
nonlinear time series models [Wu and Min (2005), Shao and Wu (2007a)]. Define the
4th cumulant spectral density

$$f_4(w_1,w_2,w_3) = \frac{1}{(2\pi)^3}\sum_{k_1,k_2,k_3\in\mathbb{Z}} \mathrm{cum}(u_0,u_{k_1},u_{k_2},u_{k_3})\exp\Big(-i\sum_{j=1}^{3} w_jk_j\Big).$$

Under Assumption 2.2, $f_4(\cdot,\cdot,\cdot)$ is continuous and bounded. In Shao
and Wu (2007b), another set of sufficient conditions for Assumption 2.2 is
provided.

Assumption 2.3. Suppose $u_t \in \mathcal{L}^4$. Let
$u_k' = F(\cdots,\varepsilon_{-1},\varepsilon_0',\varepsilon_1,\cdots,\varepsilon_k)$
and $\delta_4(k) = \|u_k - u_k'\|_4$. Assume $\sum_{k=0}^{\infty}\delta_4(k) < \infty$.

Remark 2.2. Interpreting (3) as a physical system, Wu (2005) introduced the
physical dependence measure $\delta_q(k) := \|u_k - u_k'\|_q$, $q > 0$.
Intuitively, $\delta_q(\cdot)$ quantifies the dependence of $u_k$ on
$\varepsilon_0$ by measuring the distance between $u_k$ and its coupled version
$u_k'$. Wu (2005) showed that Assumption 2.3 is true if (6) holds with
$\alpha = 4$. In other words, Assumptions 2.2 and 2.3 are both implied by the
GMC(4) condition, which has been verified for GARCH models of various forms;
see Wu and Min (2005), Proposition 3 and Shao and Wu (2007a), Proposition 5.1.
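The "forgetting of the past" behind the GMC condition can be made concrete for a GARCH(1,1) recursion. The sketch below is only an illustration under assumptions made here: the parameter values are hypothetical, the two pasts are summarized by two different starting values of the conditional variance, and both paths are driven by the same innovations. In that case the gap between the two conditional variances is multiplied by $\alpha\varepsilon_t^2 + \beta$ at each step, so it shrinks geometrically.

```python
import numpy as np

def garch11_sigma2(eps, sigma2_start, omega, alpha, beta):
    """Conditional variances sigma_1^2, ..., sigma_{n+1}^2 driven by eps_1, ..., eps_n.

    Uses u_t = eps_t * sigma_t and sigma_{t+1}^2 = omega + alpha*u_t^2 + beta*sigma_t^2,
    which is the same as sigma_{t+1}^2 = omega + (alpha*eps_t^2 + beta) * sigma_t^2.
    """
    sigma2 = np.empty(len(eps) + 1)
    sigma2[0] = sigma2_start
    for t, e in enumerate(eps):
        sigma2[t + 1] = omega + (alpha * e * e + beta) * sigma2[t]
    return sigma2

def coupling_gap(eps, start_a, start_b, omega, alpha, beta):
    """|sigma_t^2 - sigma_t'^2| for two paths sharing the innovations but not the past."""
    a = garch11_sigma2(eps, start_a, omega, alpha, beta)
    b = garch11_sigma2(eps, start_b, omega, alpha, beta)
    return np.abs(a - b)
```

With, say, `omega, alpha, beta = 0.1, 0.2, 0.5` (hypothetical values), the gap after `t` steps equals the initial gap times the running product of `alpha * eps**2 + beta`, whose expectation per step is `alpha + beta < 1`.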

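Now we introduce some regularity conditions on $G(\lambda;\theta)$. Similar
conditions can be found in Fox and Taqqu (1986), Dahlhaus (1989), Giraitis and
Surgailis (1990) and Velasco and Robinson (2000).

Assumption 2.4. For any $\delta > 0$, the following conditions hold for
$\lambda \in [0,2\pi]$.

1. The function $\int_0^{2\pi}\log G(\lambda;\theta)\,d\lambda$ $(= 0)$ can be
differentiated twice under the integral sign.

2. $\theta_1 \ne \theta_2$ implies that
$\{\lambda : G(\lambda;\theta_1) \ne G(\lambda;\theta_2)\}$ has positive Lebesgue
measure.

3. $\partial G(\lambda;\theta)/\partial\lambda$ is continuous at all
$(\lambda,\theta)$ except $\lambda = 0$, and

$$|G(\lambda;\theta)| \le C|\lambda|^{-2d-\delta}, \quad \Big|\frac{\partial G(\lambda;\theta)}{\partial\lambda}\Big| \le C|\lambda|^{-2d-1-\delta}.$$

4. $\partial G^{-1}(\lambda;\theta)/\partial\theta_j$,
$\partial^2 G^{-1}(\lambda;\theta)/\partial\theta_j\partial\theta_k$ and
$\partial^3 G^{-1}(\lambda;\theta)/\partial\theta_j\partial\theta_k\partial\theta_l$
are continuous at all $(\lambda,\theta)$ except $\lambda = 0$, and for
$j,k,l = 1,\cdots,s$,

$$\Big|\frac{\partial G^{-1}(\lambda;\theta)}{\partial\theta_j}\Big| \le C|\lambda|^{2d-\delta}, \quad \Big|\frac{\partial^2 G^{-1}(\lambda;\theta)}{\partial\theta_j\partial\theta_k}\Big| \le C|\lambda|^{2d-\delta}, \quad \Big|\frac{\partial^3 G^{-1}(\lambda;\theta)}{\partial\theta_j\partial\theta_k\partial\theta_l}\Big| \le C|\lambda|^{2d-\delta}.$$

5. $\partial^2 G^{-1}(\lambda;\theta)/\partial\lambda\partial\theta_j$ and
$\partial^3 G^{-1}(\lambda;\theta)/\partial\lambda\partial\theta_j\partial\theta_k$
are continuous at all $(\lambda,\theta)$ except $\lambda = 0$, and

$$\Big|\frac{\partial^2 G^{-1}(\lambda;\theta)}{\partial\lambda\partial\theta_j}\Big| \le C|\lambda|^{2d-1-\delta}, \quad \Big|\frac{\partial^3 G^{-1}(\lambda;\theta)}{\partial\lambda\partial\theta_j\partial\theta_k}\Big| \le C|\lambda|^{2d-1-\delta}, \quad j,k = 1,2,\cdots,s.$$

For the FARIMA model defined in (1), Assumption 2.4 can be easily verified
when $\phi(B)$ and $\psi(B)$ have all roots outside the unit circle.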
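Let $W^{(G)}(\theta)$ be the $s \times s$ matrix with $(j,k)$th entry

$$\frac{\sigma^2}{2\pi}\int_0^{2\pi} G(\lambda;\theta_0)\frac{\partial^2 G^{-1}(\lambda;\theta)}{\partial\theta_j\partial\theta_k}\,d\lambda$$

and $\Gamma^{(G)}(\theta)$ be the $s \times s$ matrix with $(j,k)$th entry

$$\frac{2\sigma^4}{\pi}\int_0^{\pi}\frac{\partial\log G(\lambda;\theta)}{\partial\theta_j}\frac{\partial\log G(\lambda;\theta)}{\partial\theta_k}\,d\lambda + 2\pi\int_0^{2\pi}\!\!\int_0^{2\pi}\frac{\partial\log G(\lambda;\theta)}{\partial\theta_j}\frac{\partial\log G(\omega;\theta)}{\partial\theta_k} f_4(\lambda,-\omega,\omega)\,d\lambda\,d\omega.$$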

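Theorem 2.1. Suppose Assumptions 2.1-2.4 hold. Then
$\hat\sigma^2 \to_p \sigma^2$ and

$$\sqrt{n}(\hat\theta_n - \theta_0) \to_D N\big(0, W^{(G)}(\theta_0)^{-1}\Gamma^{(G)}(\theta_0)W^{(G)}(\theta_0)^{-1}\big). \qquad (7)$$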



Asymptotic theory for the Whittle estimator has a long history. Early work
by Walker (1964) and Hannan (1973b) dealt with short-range dependent processes.
For long-range dependent processes, see Fox and Taqqu (1986), Dahlhaus (1989),
Giraitis and Surgailis (1990) and Velasco and Robinson (2000) among others. All
the works mentioned above assume either Gaussian processes or linear processes
with iid or conditionally homoscedastic martingale difference innovations. In a
multivariate setting, Hosoya (1997) obtained the asymptotic normality under certain
mixing conditions on the conditional moments of $u_t$. However, the latter author
did not mention how to verify those mixing conditions for statistical models. In
comparison, our assumptions on $\{u_t\}$ have been verified for various nonlinear
time series models, including GARCH-type models; see Remarks 2.1 and 2.2. In
general, our Assumption 2.3 is based on the physical dependence measure and is not
directly comparable to the mixing conditions imposed by Hosoya (1997), except in
some special cases. The following example demonstrates that our condition is
slightly weaker. On the other hand, we impose the structural assumption (3) on
$u_t$ but Hosoya (1997) did not.

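Let $u_t = \varepsilon_t\sum_{j=1}^{\infty} a_j\varepsilon_{t-j}$, where
$\varepsilon_t$ are iid random variables with mean zero, unit variance and finite
eighth moment. Assume $a_j \sim j^{-\kappa}$. Then if $\kappa > 1$, our
Assumption 2.3 is satisfied. In the assumption A of Hosoya (1997), it is required
that for $t < t_1$,

$$\mathrm{var}\big(E(u_{t_1}^2|\mathcal{F}_t) - E(u_{t_1}^2)\big) = O(|t-t_1|^{-2-\epsilon}), \quad \text{for some } \epsilon > 0. \qquad (8)$$

Note that

$$\text{LHS of (8)} = \Big\|\sum_{j=-\infty}^{t}\mathcal{P}_j u_{t_1}^2\Big\|^2 = \sum_{j=-\infty}^{t}\|\mathcal{P}_j u_{t_1}^2\|^2 = \sum_{j=-\infty}^{t}\|\mathcal{P}_0 u_{t_1-j}^2\|^2.$$

After straightforward calculations, we have
$\mathcal{P}_0 u_{t_1-j}^2 = 2a_{t_1-j}\varepsilon_0\sum_{k=t_1-j+1}^{\infty} a_k\varepsilon_{t_1-j-k} + a_{t_1-j}^2(\varepsilon_0^2 - 1)$.
So (8) holds only when $\kappa > 3/2$.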

Remark 2.3. Whittle estimation has been applied to the parametric GARCH
models based on squared observations; see Giraitis and Robinson (2001). Note that
for the GARCH model (2), the squared series $\{u_t^2\}$ follows an
ARMA(max(r, s), s) model [Fan and Yao (2003)], i.e.

$$u_t^2 = \alpha_0 + \sum_{i=1}^{\max(r,s)}(\alpha_i + \beta_i)u_{t-i}^2 + e_t - \sum_{j=1}^{s}\beta_j e_{t-j},$$

where $\alpha_{r+j} = \beta_{s+j} = 0$ for $j \ge 1$ and
$e_t = u_t^2 - \sigma_t^2$ is a martingale difference sequence. Giraitis and
Robinson (2001) adopted a more general framework and obtained a central limit
theorem for the Whittle estimator under an 8-th moment condition on $u_t$. Note
that Ling and McAleer (2002) provided a sufficient and necessary condition for the
existence of the eighth moment for $u_t$, which implies that
$u_t = G(\cdots,\varepsilon_{t-1},\varepsilon_t)$ for some measurable function $G$
and $u_t$ is GMC with order 8; see Proposition 5.1 in Shao and Wu (2007a).
Following the argument in the latter paper, it is not hard to show that $e_t$
admits a nonlinear causal representation, i.e.
$e_t = J(\cdots,\varepsilon_{t-1},\varepsilon_t)$ for some measurable function $J$,
and $e_t$ is GMC of order 4. In view of Remarks 2.1 and 2.2, our Assumptions 2.2
and 2.3 are implied by GMC(4), so our Theorem 2.1 is directly applicable to this
setting.

It is worth mentioning the work of Zaffaroni and d'Italia (2003), who studied
Whittle estimation of long memory volatility models with ARMA levels. It seems
our result is not applicable to that setting. In addition, there are models that
allow long memory in both conditional mean and conditional variance [cf. Giraitis
and Surgailis (2002)], for which our theory no longer applies. Under our framework,
we allow long memory in the level but the square of the conditional heteroscedastic
error needs to be short-range dependent, so we exclude models that have long
memory in volatility.
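The ARMA representation of the squared GARCH series can be checked numerically in the GARCH(1,1) case, where it reads $u_t^2 = \alpha_0 + (\alpha_1+\beta_1)u_{t-1}^2 + e_t - \beta_1 e_{t-1}$ with $e_t = u_t^2 - \sigma_t^2$. The Python sketch below is a hedged illustration: the parameter values, the Gaussian innovations and the function names are assumptions made here, not part of the original study.

```python
import numpy as np

def simulate_garch11(n, omega, alpha, beta, rng):
    """Simulate u_t = eps_t * sigma_t with sigma_t^2 = omega + alpha*u_{t-1}^2 + beta*sigma_{t-1}^2."""
    eps = rng.standard_normal(n)
    u = np.empty(n)
    sigma2 = np.empty(n)
    sigma2[0] = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    u[0] = eps[0] * np.sqrt(sigma2[0])
    for t in range(1, n):
        sigma2[t] = omega + alpha * u[t - 1] ** 2 + beta * sigma2[t - 1]
        u[t] = eps[t] * np.sqrt(sigma2[t])
    return u, sigma2

def arma_residual(u, sigma2, omega, alpha, beta):
    """lhs - rhs of u_t^2 = omega + (alpha+beta)*u_{t-1}^2 + e_t - beta*e_{t-1}, t = 2, ..., n."""
    e = u ** 2 - sigma2                          # martingale difference e_t
    lhs = u[1:] ** 2
    rhs = omega + (alpha + beta) * u[:-1] ** 2 + e[1:] - beta * e[:-1]
    return lhs - rhs
```

The returned residual should vanish up to rounding error for any path generated by `simulate_garch11`, because the identity is algebraic rather than distributional.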

Remark 2.4. The asymptotic covariance matrix in (7) admits a different form
compared to those in the literature. Below we show that under extra conditions
on $u_t$, our asymptotic covariance matrix is the same as the existing ones; compare
Fox and Taqqu (1986), Dahlhaus (1989), Giraitis and Surgailis (1990) and Velasco
and Robinson (2000). A key assumption in Velasco and Robinson (2000) is that

$$E(u_t|\mathcal{F}_{t-1}) = 0, \quad E(u_t^j|\mathcal{F}_{t-1}) = \text{constant for } j = 2,3,4. \qquad (9)$$

Under (9), $\mathrm{cum}(u_0,u_{k_1},u_{k_2},u_{k_3}) = E(u_0^4) - 3\sigma^4$ only
when $k_1 = k_2 = k_3 = 0$ and zero otherwise. This implies that
$f_4(w_1,w_2,w_3) = [E(u_0^4) - 3\sigma^4]/(2\pi)^3$ for all
$(w_1,w_2,w_3) \in [-\pi,\pi)^3$. If $u_t$ are iid or Gaussian [see Fox and Taqqu
(1986), Dahlhaus (1989) and Giraitis and Surgailis (1990)], the fourth order
spectrum is also a constant. Consequently, the contribution of the fourth order
cumulant spectrum to $\Gamma^{(G)}(\theta_0)$ vanishes, because
$\int_0^{2\pi}\partial\log G(\lambda;\theta_0)/\partial\theta_j\,d\lambda = 0$
under Assumption 2.4, and the asymptotic covariance matrix in (7) reduces to
$4\pi\Sigma(\theta_0)^{-1}$, where the $(j,k)$th entry of $\Sigma(\theta_0)$ is

$$\int_0^{2\pi}\frac{\partial\log G(\lambda;\theta_0)}{\partial\theta_j}\frac{\partial\log G(\lambda;\theta_0)}{\partial\theta_k}\,d\lambda.$$

Note that we have applied the fact that the $(j,k)$th entry of
$W^{(G)}(\theta_0)$ equals
$\frac{\sigma^2}{2\pi}\int_0^{2\pi}\frac{\partial\log G(\lambda;\theta_0)}{\partial\theta_j}\frac{\partial\log G(\lambda;\theta_0)}{\partial\theta_k}\,d\lambda$.
Hence, our asymptotic covariance matrix coincides with those presented in
Theorem 4 of Giraitis and Surgailis (1990) and Theorem 2 of Velasco and Robinson
(2000) for $p = 1$, i.e. the untapered case. Theorem 2.1 suggests that conditional
heteroscedasticity affects the asymptotic covariance matrix through the
non-constant fourth order cumulant spectra of $u_t$. For some non-Gaussian
processes, Giraitis and Taqqu (1999) demonstrated that the Whittle estimator may
not be $\sqrt{n}$-consistent and the limiting distribution may not be Gaussian. Our
results have a different range of applicability.

To construct a confidence region for $\theta_0$, one can estimate the asymptotic
covariance matrix directly, which involves the estimation of the integral of the
fourth order cumulant spectra. For short memory time series, the latter problem has
been studied by Taniguchi (1982), Keenan (1987) and Chiu (1988), but the
applicability of their methods to long memory time series is not clear. Alternative
methods that bypass direct estimation are currently under investigation and we
hope to report on them in the near future.

Remark 2.5. Assumption 2.1 excludes seasonal long memory processes, such as the
Gegenbauer process [Gray, Zhang and Woodward (1989)], in which the spectral
density function has a pole at a nonzero frequency. The work by Velasco and
Robinson (2000) seems to allow for such processes.

3 Nonstationary case 

In this section, we shall consider the nonstationary case, i.e. $m \ge 1$. For the
convenience of presentation, we assume that
$G(\lambda;\theta) = |1-e^{i\lambda}|^{-2d_X}G(\lambda;\theta^{(-1)})$, i.e. the
spectral density function of $X_t$ can be factorized into a product of the
fractionally integrated component $|1-e^{i\lambda}|^{-2d_X}$ and the short memory
component $G(\lambda;\theta^{(-1)})$. This adds a slight constraint on the class of
models, but is not overly restrictive due to the prevalence of fractionally
integrated models in practice. Define
$H(\lambda;\theta) = |1-e^{i\lambda}|^{-2d}G(\lambda;\theta^{(-1)})$, where
$d = m + d_X$ is the fractional integration order of $Y_t$. Note that
$m = \lfloor d + 1/2\rfloor$.
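To make the decomposition $d = m + d_X$ with $m = \lfloor d + 1/2\rfloor$ concrete, the following Python sketch splits an integration order into its two parts and generates a path by applying a truncated $(1-B)^{-d_X}$ filter followed by $m$ partial sums. It is a rough illustration under assumptions made here: iid Gaussian innovations stand in for the dependent errors, the moving average expansion is truncated after `burn` lags, and the partial sums start from zero initial values.

```python
import numpy as np

def split_order(d):
    """Return (m, d_X) with m = floor(d + 1/2) and d_X = d - m in [-1/2, 1/2)."""
    m = int(np.floor(d + 0.5))
    return m, d - m

def frac_int_weights(d, size):
    """First `size` coefficients psi_j of (1 - B)^{-d} = sum_j psi_j B^j."""
    psi = np.empty(size)
    psi[0] = 1.0
    for j in range(1, size):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
    return psi

def simulate_integrated(n, d, rng, burn=2000):
    """Return n + m values of a process whose m-th difference is a truncated I(d_X) series."""
    m, d_x = split_order(d)
    if m < 0:
        raise ValueError("d must be at least -1/2")
    u = rng.standard_normal(n + m + burn)          # iid stand-in for the errors u_t
    psi = frac_int_weights(d_x, burn + 1)
    # Entries burn, burn+1, ... of the full convolution use all burn + 1 weights.
    x = np.convolve(u, psi)[burn:burn + n + m]
    y = x
    for _ in range(m):                             # undo (1 - B)^m by partial summation
        y = np.cumsum(y)
    return y
```

For instance, `split_order(1.2)` gives $m = 1$ and $d_X \approx 0.2$, while `split_order(1.5)` gives $m = 2$ and $d_X = -0.5$.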

3.1 Inconsistency of the Whittle estimator when $d_0 > 1$

The consistency of the Whittle estimator has been obtained by Velasco and
Robinson (2000) for $d_0 \in (1/2,1)$ and it is still unknown whether the Whittle
estimator is consistent when $d_0 \ge 1$. A semiparametric frequency-domain
approach to estimating the order of fractional integration, which is closely
related to the Whittle estimation, is the so-called local Whittle estimation. The
local Whittle estimator of $d$ is consistent when
$d_0 \in (-1/2,1/2) \cup (1/2,1]$ and is inconsistent when $d_0 > 1$; see Phillips
and Shimotsu (2004) and Shao and Wu (2007b). Similar results can be expected for
the Whittle estimator due to the similarity in the theoretical justifications for
these two estimators. Here we shall show the inconsistency of the Whittle
estimator when $d_0 \in (1,3/2)$, which provides a sound motivation for the
consideration of the nonstationarity-extended Whittle estimation (see
Section 3.2). Using a similar argument, one can show that when
$d_0 \in \{d > 3/2, d \ne (2k+1)/2, k = 2,3,\cdots\}$, the Whittle estimator is
inconsistent. Since the proof does not involve additional methodological
difficulties, we omit the details.
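Define

$$\check\theta_n = \mathrm{argmin}_{\theta\in\Theta}\, M_n(\theta), \quad M_n(\theta) := \frac{2\pi}{n}\sum_{j=1}^{n-1}\frac{I_Y(\lambda_j)}{H(\lambda_j;\theta)}.$$

Denote by $\theta^* = \mathrm{argmin}_{\theta\in\Theta}\, M(\theta)$, where
$M(\theta) = \int_0^{2\pi}|1-e^{i\lambda}|^{-2}H^{-1}(\lambda;\theta)\,d\lambda$.
Let $K(\lambda;\theta) = |1-e^{i\lambda}|^{2d}H(\lambda;\theta)$.

Theorem 3.1. Suppose Assumptions 2.1-2.3 and Assumption 2.4 (with
$G(\lambda;\theta)$ replaced by $H(\lambda;\theta)$) hold. Further, for all
$\lambda$ and $\theta$, $0 < C_1 \le K(\lambda;\theta) \le C_2 < \infty$. Assuming
$d_0 \in (1,3/2)$, then $\check\theta_n \to_p \theta^*$.

Like the local Whittle estimator, the Whittle estimator of $d$ converges to 1 in
probability when $d_0 \in (1,3/2)$, since $\theta^{*(1)} = 1$. Further, we
conjecture that when $d_0 = 1$, the Whittle estimator is also consistent.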

3.2 Nonstationarity-extended Whittle Estimator 

In this subsection, we propose the nonstationarity-extended Whittle estimator
following the idea of Abadir et al. (2007), who introduced the extended Fourier
transform and periodogram to deal with nonstationarity in the local Whittle
estimation. Note that the extended discrete Fourier transform and periodogram have
been suggested in an early work by Phillips (1999) for $d_0 \in (-1/2,3/2)$.

Assume $d_0 = m_0 + d_{X0} \notin \{p - 1/2, p \in \mathbb{Z}\}$, where $m_0$ is
the true value of $m$. We define the extended periodogram as

$$I_Y(\lambda_j;d) := |w_Y(\lambda_j;d)|^2, \quad w_Y(\lambda_j;d) = w_Y(\lambda_j) + J(\lambda_j;d), \quad d \in [a_1,a_2],$$

where

$$J(\lambda_j;d) = \begin{cases} 0, & \text{if } d \in [-1/2,1/2), \\ e^{i\lambda_j}\sum_{r=1}^{m}(1-e^{i\lambda_j})^{-r}Z_r, & \text{if } d \in I_m := [m-1/2, m+1/2),\ m \in \mathbb{N}, \end{cases}$$

with $Z_r = (2\pi n)^{-1/2}\big((1-B)^{r-1}Y_n - (1-B)^{r-1}Y_0\big)$,
$r = 1,2,\cdots,m$. As mentioned in Abadir et al. (2007), the enumeration of the
data should be $Y_{-h+1},\cdots,Y_n$, where $h = \lfloor a_2 + 1/2\rfloor \vee 0$.
For example, when $a_2 < 1/2$, the enumeration is $Y_1,\cdots,Y_n$; when
$a_2 \in [1/2,3/2)$, the data is enumerated as $Y_0,\cdots,Y_n$.
Then the nonstationarity-extended Whittle estimator is defined as

$$\tilde\theta_n = \mathrm{argmin}_{\theta\in\Theta}\, L_n(\theta), \quad L_n(\theta) = \frac{2\pi}{n}\sum_{j=1}^{n-1}\frac{I_Y(\lambda_j;d)}{H(\lambda_j;\theta)}.$$

Again, $\sigma^2$ is estimated by $\tilde\sigma^2 = L_n(\tilde\theta_n)$.
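The extended discrete Fourier transform can be sketched in a few lines of Python. The code below follows the definition of $w_Y(\lambda_j;d)$ and $Z_r$ given in this subsection; the array layout (`y_full` holding $Y_{-h+1},\cdots,Y_n$) and the function names are conventions assumed here. A useful sanity check, used in the proofs via Lemma 4.4 of Abadir et al. (2007), is that when $(1-B)^mY_t = X_t$ the extended transform at the Fourier frequencies equals $w_X(\lambda_j)(1-e^{i\lambda_j})^{-m}$.

```python
import numpy as np

def dft(z, lam):
    """w_Z(lambda) = (2*pi*n)^{-1/2} * sum_{t=1}^n z_t e^{i t lambda}."""
    n = len(z)
    t = np.arange(1, n + 1)
    return np.exp(1j * np.outer(lam, t)) @ z / np.sqrt(2.0 * np.pi * n)

def extended_dft(y_full, n, d):
    """Return (lambda_j, w_Y(lambda_j; d)) for j = 1, ..., n-1.

    y_full holds Y_{-h+1}, ..., Y_n (length n + h) and must satisfy
    h >= m, where m = floor(d + 1/2) is the number of correction terms.
    """
    h = len(y_full) - n
    m = max(int(np.floor(d + 0.5)), 0)
    if m > h:
        raise ValueError("need at least m presample values")
    lam = 2.0 * np.pi * np.arange(1, n) / n
    w = dft(y_full[h:], lam)                     # ordinary DFT of Y_1, ..., Y_n
    e = np.exp(1j * lam)
    for r in range(1, m + 1):
        dy = np.diff(y_full, n=r - 1)            # (1 - B)^{r-1} Y_t; entry i is time i + r - h
        z_r = (dy[-1] - dy[h - r]) / np.sqrt(2.0 * np.pi * n)
        w = w + e * (1.0 - e) ** (-r) * z_r      # add the r-th term of J(lambda_j; d)
    return lam, w
```

The extended periodogram is then `np.abs(w) ** 2`, which can be plugged into the objective $L_n(\theta)$ in place of the ordinary periodogram.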

Theorem 3.2. Suppose Assumptions 2.1-2.3 and Assumption 2.4 (with
$G(\lambda;\theta)$ replaced by $H(\lambda;\theta)$) hold. Further, for all
$\lambda$ and $\theta$, $0 < C_1 \le K(\lambda;\theta) \le C_2 < \infty$. Then we
have $\tilde\sigma^2 \to_p \sigma^2$ and

$$\sqrt{n}(\tilde\theta_n - \theta_0) \to_D N\big(0, W^{(H)}(\theta_0)^{-1}\Gamma^{(H)}(\theta_0)W^{(H)}(\theta_0)^{-1}\big).$$

Therefore, the nonstationarity-extended Whittle estimator is consistent and
asymptotically normal irrespective of the true value of the fractional integration
order. Further, the asymptotic covariance matrix admits the same form as in the
stationary case, whereas for the tapered Whittle estimator [Velasco and Robinson
(2000)], the variance is inflated due to the exclusion of certain frequencies and
the tapering effect, and the inflation factor gets large as the order of fractional
integration increases, since a higher order taper is needed to accommodate a
larger $d$.



4 Finite Sample Performance 

In this section, we examine the finite sample performance of the Whittle
estimator $\hat\theta_n$ ($\check\theta_n$), the nonstationarity-extended Whittle
estimator $\tilde\theta_n$, and two tapered Whittle estimators proposed by Velasco
and Robinson (2000) through a small simulation study. For the tapered Whittle
estimators, we use both cosine weights, i.e. $h_t = (1 - \cos(2\pi t/n))/2$, and
Parzen's weights, where

$$h_t = \begin{cases} 1 - 6\Big[\Big|\frac{2t-n}{n}\Big|^2 - \Big|\frac{2t-n}{n}\Big|^3\Big], & N < t < 3N, \\ 2\Big(1 - \Big|\frac{2t-n}{n}\Big|\Big)^3, & 1 \le t \le N,\ 3N \le t \le 4N, \end{cases}$$

with $N = n/4$. So the numbers of frequencies included in the objective functions
of the tapered Whittle estimators are $\lfloor n/3 - 1\rfloor$ and
$\lfloor n/4 - 1\rfloor$ respectively. Two sample sizes $n = 200$ and $n = 512$ are
investigated.
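For reference, the two sets of taper weights can be computed as in the Python sketch below. It assumes the standard Parzen window written in terms of $x = |2t-n|/n$ and that $n$ is divisible by 4; the function names are chosen here for illustration.

```python
import numpy as np

def cosine_taper(n):
    """Cosine weights h_t = (1 - cos(2*pi*t/n)) / 2 for t = 1, ..., n."""
    t = np.arange(1, n + 1)
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * t / n))

def parzen_taper(n):
    """Parzen weights for t = 1, ..., n with N = n/4 (n assumed divisible by 4)."""
    t = np.arange(1, n + 1)
    x = np.abs(2.0 * t - n) / n
    inner = 1.0 - 6.0 * (x ** 2 - x ** 3)       # N < t < 3N, i.e. x < 1/2
    outer = 2.0 * (1.0 - x) ** 3                 # 1 <= t <= N or 3N <= t <= 4N
    return np.where(x < 0.5, inner, outer)
```

Both tapers equal one at the midpoint $t = n/2$ and vanish at $t = n$, and the two branches of the Parzen weights agree at $t = N$ and $t = 3N$, where both equal $1/4$.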
Consider the following model

$$(1 - 0.65B + 0.6B^2)(1-B)^{d}Y_t = u_t, \qquad (10)$$

where

ut = etou o] = 0.4 + 0.3n2_i + O.Scj^^^ (11) 

with $\varepsilon_t$ independently generated from the standard normal
distribution. Thus the GARCH model (11) admits a finite fourth moment; see
Davidson (2004) for some sufficient conditions on the existence of higher order
moments for GARCH models. Note that the FARIMA(2,d,0) model (10) has been
investigated in Velasco and Robinson (2000), but under iid assumptions on $u_t$.
Here we examine a wide range of $d$'s from $-0.4$ to $2.4$, including $d = 1.5$,
which is not covered by our theory. We take $[a_1,a_2] = [-0.49,3.49]$. Tables 1-3
report the bias and 100 x mean square error (MSE) for the estimates of $d$,
$\phi_1$ and $\phi_2$ based on 1000 replications. In the tables, the symbols W-1,
W-3, W-4 and EW correspond to the Whittle estimator, the tapered Whittle estimator
with cosine weights, the tapered Whittle estimator with Parzen's weights, and the
nonstationarity-extended Whittle estimator respectively.

As expected, the bias and MSE decrease as the sample size increases. It
appears that the mean squared error for the nonstationarity-extended Whittle
estimator is substantially smaller than those for the two tapered estimators, and
is similar to that for the Whittle estimator when $d_0 \in (-0.5,0.5)$. The
inconsistency of the Whittle estimator when $d_0 > 1$ can be easily seen from both
the bias and MSE. The tapered estimator using Parzen's weights shows a severe
downward bias in estimating $d$ and an upward bias in estimating $\phi_1$ when
$n = 200$. Although in theory the tapered estimator with cosine weights is still
asymptotically normal when $d_0 = 2.4$, the bias and MSE get noticeably large since
it is close to the region of inconsistency, i.e. $d_0 > 2.5$. The result for the
case $d_0 = 1.5$ does not seem to be very different from those for other values of
$d$, which suggests the theory works for this case.

We also tried three different models for $u_t$: (i) iid N(0,1); (ii) asymmetric
GARCH(1,1); (iii) regular GARCH(1,1) but with infinite fourth moments. The
results are qualitatively similar to what we observe here (results not shown).
Overall, the nonstationarity-extended Whittle estimator outperforms both tapered
Whittle estimators uniformly in the range of $d$ examined here. Both theory and
simulation studies suggest that the nonstationarity-extended Whittle estimator is
preferable to the tapered Whittle estimator, so we recommend its use to
practitioners.

5 Conclusions 

This paper presents an asymptotic theory for the Whittle estimator of a class of
long memory time series models with uncorrelated but dependent errors. Our
dependence conditions on the errors are mild and can be verified for a large class
of nonlinear time series models, including GARCH-type models. Following the
idea in Abadir et al. (2007), we extend the range of consistency and asymptotic
normality by developing the nonstationarity-extended Whittle estimator. Both
theory and finite sample results demonstrate that the proposed estimator is more
efficient than the tapered Whittle estimator [Velasco and Robinson (2000)]. It
is worth noting that our framework is limited to Type I fractional processes. For
Type II processes, the extended local Whittle estimation has been investigated by
Shimotsu and Phillips (2005) and it would be interesting to extend their idea to
Whittle estimation.



6 Technical Appendix 

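For the convenience of notation, write $A_j = A(\lambda_j)$,
$I_{Xj} = I_X(\lambda_j)$, $I_{uj} = I_u(\lambda_j)$, $f_{Xj} = f_X(\lambda_j)$,
$f_{uj} = f_u(\lambda_j)$, $i_{Xj} = I_{Xj}f_{Xj}^{-1}$,
$i_{uj} = I_{uj}f_{uj}^{-1}$, $w_{Xj} = w_X(\lambda_j)$, $w_{uj} = w_u(\lambda_j)$,
$j = 1,\cdots,n-1$. Let $g_j = w_{Xj}/\sqrt{f_{Xj}}$ and
$h_j = w_{uj}/\sqrt{f_{uj}}$. Denote by
$D(w) = D_n(w) = \sum_{t=1}^{n} e^{itw}$. Let $K(w) = (2\pi n)^{-1}|D(w)|^2$ be
Fejer's kernel. Denote by $a \vee b = \max(a,b)$ and $a \wedge b = \min(a,b)$. Let
$\tilde{n} := \lfloor n/2\rfloor$.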



6.1 Proofs of Theorems 2.1-3.2

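Proof of Theorem 2.1. The consistency of $\hat\theta_n$ can be proved along the
lines of the proof of Theorem 1 of Velasco and Robinson (2000). Since it is simpler
than the proof of the consistency of $\tilde\theta_n$ (see Theorem 3.2), we skip
the presentation. Applying the mean value theorem, we have

$$0 = \frac{\partial Q_n(\hat\theta_n)}{\partial\theta} = \frac{\partial Q_n(\theta_0)}{\partial\theta} + \frac{\partial^2 Q_n(\bar\theta_n)}{\partial\theta^2}(\hat\theta_n - \theta_0),$$

where $\bar\theta_n = \theta_0 + \alpha(\hat\theta_n - \theta_0)$ for some
$\alpha \in (0,1)$. Under Assumption 2.4, we get by Lemma 6.1 that
$\partial^2 Q_n(\theta)/\partial\theta^2 \to_p W^{(G)}(\theta)$ elementwise
uniformly in $\theta \in \Theta_1$, where
$\Theta_1 = [d_0 - 1/2 + \Delta, a_2] \times \Theta^{(-1)}$ for some
$\Delta \in (0,1/4)$. Following the same argument as in the proof of Lemma 7 of
Velasco and Robinson (2000),
$\sqrt{n}(\hat\theta_n - \theta_0) = -W^{(G)}(\theta_0)^{-1}\sqrt{n}\,\partial Q_n(\theta_0)/\partial\theta + o_p(1)$.
Thus the conclusion follows if we can show

$$\sqrt{n}\,\frac{\partial Q_n(\theta_0)}{\partial\theta} = -\frac{2\pi}{\sqrt{n}}\sum_{j=1}^{n-1}\frac{I_{Xj}}{G(\lambda_j;\theta_0)}\frac{\partial\log G(\lambda_j;\theta_0)}{\partial\theta} \to_D N(0,\Gamma^{(G)}(\theta_0)). \qquad (12)$$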

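For each $h = 1,2,\cdots,s$, let
$l(\lambda) = \partial\log G(\lambda;\theta_0)/\partial\theta_h$ and
$l_k = l(\lambda_k)$. Note that
$l(\lambda_k) = -G(\lambda_k;\theta_0)\,\partial G^{-1}(\lambda_k;\theta_0)/\partial\theta_h$.
Under Assumption 2.4, we apply the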
mean value theorem and obtain \lk — lk+i\ ^ Cn 
calculation. By Lemma [67 



via a straightforward 



E 



Xk — ^ukj 



k=l 



< 



fc+i 



IE 



k=l 



^(/Xj - luj) 



c 



<-Y, 1-^(^1/4(1 + log kfl^ + fci/2n-V4) + Cn^/^ log n = o(V^). 



n 
fc=i 

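Thus it suffices to show

$$\frac{2\pi}{\sqrt{n}}\sum_{j=1}^{n-1} I_{uj}\frac{\partial\log G(\lambda_j;\theta_0)}{\partial\theta} \to_D N(0,\Gamma^{(G)}(\theta_0)),$$

which is established in Lemma 6.2. Finally, the consistency of $\hat\sigma^2$
follows from the consistency of $\hat\theta_n$ and Lemma 6.1. The proof is now
complete.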

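Proof of Theorem 3.1. Let $\Theta_1 = [1/2 + \Delta, a_2] \times \Theta^{(-1)}$
and $\Theta_2 = [a_1, 1/2 + \Delta) \times \Theta^{(-1)}$, possibly empty. When
$d_0 \in (1,3/2)$, $m_0 = 1$ and $d_{X0} \in (0,1/2)$. Since
$w_Y(\lambda_j) = w_Y(\lambda_j;d_0) - J(\lambda_j;d_0)$ and
$w_Y(\lambda_j;d_0) = w_X(\lambda_j)(1-e^{i\lambda_j})^{-1}$, we can write
$M_n(\theta) = M_{1n}(\theta) - M_{2n}(\theta) - \overline{M_{2n}(\theta)} + M_{3n}(\theta)$,
where

$$M_{1n}(\theta) = \frac{2\pi}{n}\sum_{j=1}^{n-1}\frac{I_X(\lambda_j)|1-e^{i\lambda_j}|^{-2}}{H(\lambda_j;\theta)}, \quad M_{2n}(\theta) = \frac{2\pi}{n}\sum_{j=1}^{n-1}\frac{w_Y(\lambda_j;d_0)\overline{J(\lambda_j;d_0)}}{H(\lambda_j;\theta)},$$

$$M_{3n}(\theta) = \frac{2\pi}{n}\sum_{j=1}^{n-1}\frac{|J(\lambda_j;d_0)|^2}{H(\lambda_j;\theta)} = \frac{2\pi}{n}\sum_{j=1}^{n-1}\frac{|1-e^{i\lambda_j}|^{-2}Z_1^2}{H(\lambda_j;\theta)},$$

with $Z_1 = (2\pi n)^{-1/2}\sum_{t=1}^{n} X_t$. Applying the argument in Lemma A.5
of Shao and Wu (2007b), it is not hard to show that

$$0 < C_3 \le \lim_{n\to\infty} E(Z_1^2)/n^{2d_{X0}} \le C_4 < \infty. \qquad (13)$$

Hence for $\theta \in \Theta_1$, $M_{3n}(\theta)$ dominates the other two terms in
magnitude. By Lemma 6.1,
$\sup_{\theta\in\Theta_1}|M_{1n}(\theta) - M_1(\theta)| = o_p(1)$, where
$M_1(\theta) = \int_0^{2\pi} f_X(\lambda)|1-e^{i\lambda}|^{-2}H^{-1}(\lambda;\theta)\,d\lambda$.
Under Assumption 2.3, we have that $M_{3n}(\theta)/Z_1^2 \to M(\theta)$ holds
uniformly for $\theta \in \Theta_1$. By the Cauchy-Schwarz inequality, it follows
that $\sup_{\theta\in\Theta_1}|M_n(\theta)/Z_1^2 - M(\theta)| = o_p(1)$ and that
with probability tending to 1,

$$\inf_{|\theta-\theta^*|\ge\epsilon,\,\theta\in\Theta_1}\{M_n(\theta)/Z_1^2 - M_n(\theta^*)/Z_1^2\} \ge \eta(\epsilon) > 0, \quad \text{for any } \epsilon > 0,$$

since $M(\theta)$ is uniformly continuous in $\Theta_1$. In view of the argument
in Velasco and Robinson (2000), the conclusion follows if we can show that for any
$\epsilon > 0$,

$$P\Big(\inf_{\theta\in\Theta_2}\{M_n(\theta)/Z_1^2 - M(\theta^*)\} \le \epsilon\Big) \to 0. \qquad (14)$$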
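Denote by $p_n = \lfloor n/3\rfloor$, $a_j = (j/p_n)^{2\Delta-1}$ if
$1 \le j \le p_n$. For $\theta \in \Theta_2$,

$$M_n(\theta) \ge \frac{C}{n}\sum_{j=1}^{p_n}(j/p_n)^{2\Delta+1}I_Y(\lambda_j) \ge \frac{C}{n}\sum_{j=1}^{p_n} a_j\lambda_j^{2}I_Y(\lambda_j).$$

Again, by Lemma 6.1, the Cauchy-Schwarz inequality and (13), we have

$$\inf_{\theta\in\Theta_2} M_n(\theta)/Z_1^2 \ge \frac{C}{n}\sum_{j=1}^{p_n} a_j\lambda_j^{2}|1-e^{i\lambda_j}|^{-2}(1 + o_p(1)) \ge \frac{C}{n}\sum_{j=1}^{p_n} a_j(1 + o_p(1)),$$

where the above constant $C$ does not depend on $\Delta$. Note that

$$\frac{1}{n}\sum_{j=1}^{p_n} a_j \to \frac{1}{3}\int_0^1 x^{2\Delta-1}\,dx = \frac{1}{6\Delta},$$

which can be made arbitrarily large when $\Delta > 0$ is sufficiently small. Thus
(14) holds since $M(\theta^*)$ is finite. This completes the proof.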



Proof of Theorem 13.21 The proof of the consistency closely follows the argument 
in the proof of Velasco and Robinson's (2000) Theorem 1. By definition, we have 

Iy(Aj;(i) = \wY{\j]dQ) + Tj{d)\^, where Tj{d) = J{Xj;d) — J(Aj;do), d G [01,02]. 

So Tj{d) = if m = mo- By Lemma 4.4 in Abadir et al. (2007), WY{Xj',do) = 
wxiXj){l - e^^O"'""- Write L„(^) = 1^(9) + L2n{e) + + LsniO), where 

^ 2^!^ |w(A,;do)P 27r!^ /x(A,)|l-e-^^|-^'"° 



pi H{Xf,e) n j^^ H{Xf,9) 



T (ff-. _ 27r WY{Xj;do)Tj{d) _ 27r \Tj{d)\'' 

L2n{e) - -1. H^^.o^ ' L-^r.ie) - - (15) 
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Define 



"^^"^^^ = T F(a;;^^ ' = — mxj) — 

For any A e (0, 1/4), let c^a = do - 1/2 + A and define Gi = [c^a, 02] x ©^"^^ and 
©2 = [oiit^A) X 0^~^\ possibly empty. In view of the argument in Velasco and 
Robinson (2000), it suffices to show that as n ^ 00, with probabihty tending to 1, 

inf {Ln{e) - LniOo)} > r]{e) > for any e > (17) 

|6»-6»o|>e,6le0i 

and 



P l^mf {Ln{e) - L{9o)} < ej ^ for any e > 0. (18) 

The statement (17) is implied by (i) $\inf_{|\theta - \theta_0| \ge \epsilon,\ \theta \in \Theta_1} |L(\theta) - L(\theta_0)| \ge \eta(\epsilon) > 0$ and
(ii) $\sup_{\theta \in \Theta_1} |L_n(\theta) - L(\theta)| \to_p 0$. The former follows from the uniform continuity
of $L(\theta)$ on $\Theta_1$ and the identifiability conditions in Assumption 2.4. It follows from
Lemma 6.1 that $\sup_{\theta \in \Theta_1} |L_{1n}(\theta) - L(\theta)| = o_p(1)$, which consequently results in (ii)
in view of Lemma 6.3 and the fact that $\sup_{\theta \in \Theta_1} |L_{1n}(\theta) - L(\theta)| = o(1)$.

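Next, we show (18) when $\Theta_2$ is nonempty. By Lemma 6.4, we have that with
probability tending to 1,

$$\inf_{\theta \in \Theta_2} L_n(\theta) \ge \frac{C}{n}\sum_{j=1}^{\tilde n} \lambda_j^{2d_\Delta} I_{Xj}\,|1 - e^{i\lambda_j}|^{-2m_0},$$

where the positive constant $C$ above is independent of $\Delta$. By Lemma 6.1, the
above term converges in probability to

$$\frac{C}{n}\sum_{j=1}^{\tilde n} f_X(\lambda_j)\,|1 - e^{i\lambda_j}|^{-2m_0}\,|\lambda_j|^{2d_\Delta} \ge \frac{C}{n}\sum_{j=1}^{\tilde n} |\lambda_j|^{2\Delta - 1} \sim C\int_0^{\pi} |\lambda|^{2\Delta - 1}\,d\lambda = C\pi^{2\Delta}/\Delta \to \infty \quad \text{as } \Delta \downarrow 0.$$

So the assertion (18) follows by choosing $\Delta > 0$ such that $C\pi^{2\Delta}/\Delta > L(\theta_0) + 2\epsilon$.
Therefore, $\hat\theta_n \to_p \theta_0$.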
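
The divergence of $\pi^{2\Delta}/\Delta$ as $\Delta \downarrow 0$, which drives the bound on $\Theta_2$, can be checked numerically. The Python sketch below is only an illustration and not part of the original argument: it compares a Riemann sum over Fourier frequencies with the closed form $\int_0^\pi \lambda^{2\Delta-1}\,d\lambda = \pi^{2\Delta}/(2\Delta)$ and evaluates that closed form for shrinking $\Delta$. The function names, the grid size and the chosen values of $\Delta$ are assumptions made for the example.

```python
import numpy as np

def closed_form(delta):
    """Integral of lam**(2*delta - 1) over (0, pi]."""
    return np.pi ** (2 * delta) / (2 * delta)

def riemann_sum(delta, n):
    """(2*pi/n) * sum of lam_j**(2*delta - 1) over Fourier frequencies in (0, pi]."""
    j = np.arange(1, n // 2 + 1)
    lam = 2 * np.pi * j / n
    return (2 * np.pi / n) * np.sum(lam ** (2 * delta - 1))

deltas = [0.2, 0.1, 0.05, 0.01]
values = [closed_form(d) for d in deltas]   # grows as delta shrinks
approx = riemann_sum(0.2, 10**6)            # Riemann-sum counterpart of closed_form(0.2)
```

The Riemann sum converges slowly because the integrand is singular at the origin, which is why a large grid is used here.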

To show the asymptotic normality of $\hat\theta_n$, we define another (infeasible) estimator
$\tilde\theta_n$ by

$$\tilde\theta_n = \operatorname{argmin}_{\theta \in \Theta} L_{1n}(\theta).$$
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Since e'^n'^ do and do / P + 1/2, p € P{d\'> G V,) ^ 1. So P(4 / ^n) ^ 
and 0„ — >p ^0- Thus it suffices to show the asymptotic normality of On- Note that 



3(1) 



de 



de 



n-l 



2vr ^^ Ix{\j) dlog H{\j;( 



E 



5^ 



By an argument similar to that in the proof of Theorem 2.1, and by Lemma 6.2, we can
show that



n- 



dLi 



27r 



n-l 



89 



d\ogH{Xj;9o) 
80 



+ Op(l) A^(o,r(^)(0o)). 



Further, by Lemma IGT} d'^Lin{0)/8'^9 — >p W^^\9) uniformly in ^ G ©i elemen- 
twise. Thus the asymptotic normality of 9n holds and so does 9n- Finally, it 
follows from the consistency of $\hat\theta_n$, the continuity of $H^{-1}(\lambda;\theta)$ with respect to $\theta$
and Lemma 6.1 that



n ^ 



j=\ H{Xj;9n) 
Ix{Xj) 



1 H{Xj;9o)\l - e*^^|2'"o 



+ Op{l)=a'^ + Op{l). 



This completes the proof. 







6.2 Lemmas 

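The following lemma extends Lemma A.2 in Velasco and Robinson (2000) to allow
conditionally heteroscedastic errors.

Lemma 6.1. Assume that the function $\phi(\lambda;\theta)$ is even in $\lambda$, periodic of period $2\pi$
and continuously differentiable in $\lambda$ and $\theta$ except at $\lambda = 0$. Further assume that there
exists a $\delta \in (0, 1)$ such that for $j = 1, 2, \ldots, s$ and all $\theta \in \Theta$,

$$|\phi(\lambda;\theta)| \le C f_X^{-1}(\lambda)\,|\lambda|^{-\delta}, \qquad \Big|\frac{\partial \phi(\lambda;\theta)}{\partial \lambda}\Big| \le C f_X^{-1}(\lambda)\,|\lambda|^{-1-\delta} \qquad (19)$$

and

$$\Big|\frac{\partial \phi(\lambda;\theta)}{\partial \theta_j}\Big| \le C f_X^{-1}(\lambda)\,|\lambda|^{-\delta}.$$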
Let $J_n(\theta) = 2\pi n^{-1}\sum_{j=1}^{n-1} \phi(\lambda_j;\theta) I_X(\lambda_j)$ and $J(\theta) = \int_0^{2\pi} \phi(\lambda;\theta) f_X(\lambda)\,d\lambda$. Suppose
that Assumptions 2.1, 2.2, 2.4 and 6.1 hold. Then

$$\sup_{\theta \in \Theta} |J_n(\theta) - J(\theta)| = o_p(1) \quad \text{as } n \to \infty.$$

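Proof of Lemma 6.1. It suffices to show the pointwise convergence, since the uniform
convergence follows from the equicontinuity argument in view of the compactness
of $\Theta$ and the differentiability of $\phi(\lambda;\theta)$ in $\theta$. Let $\tilde J_n(\theta) = 2\pi n^{-1}\sum_{j=1}^{n-1} \psi(\lambda_j;\theta) I_u(\lambda_j)$,
where $\psi(\lambda;\theta) = \phi(\lambda;\theta) G(\lambda;\theta_0)$. Hereafter in the proof, we suppress $\theta$ and write
$\psi_j$ etc. for $\psi(\lambda_j;\theta)$ etc. Then

$$|J_n - J| \le |\tilde J_n - E(\tilde J_n)| + |E(\tilde J_n) - J| + |J_n - \tilde J_n|. \qquad (20)$$

Applying Lemma 6.8,

$$E|\tilde J_n - E(\tilde J_n)|^2 \le C n^{-2} \sum_{j,k=1}^{n-1} |\psi_j \psi_k|\,|\mathrm{cov}(I_{uj}, I_{uk})|.$$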



Using the continuity of $\psi(\lambda;\theta)$ and $f_u(\lambda)$ as well as the integrability of $\psi(\lambda;\theta) f_u(\lambda)$,
we get



mj'n) - j\ 



2 /■27r 

n f^^ Jo 



{X)dX 



0(1). 



Further, summation by parts yields 



Wn-Jnl < -E 

n 



<-yE 

n ^ 



k=l 



(J 

x\4>kfxk - (Pk+ifx{k+i) \ + —IE 



n 



"J J 



^h+lfx{n+l)\- 



By the mean value theorem, |(/>fc/xfc - </'fc+i/x(fc+i)l < C"?^ ^A^ ^ \ k = l,2,--- ,h 
under (fT9|) and Assumption 12. 4i Thus by Lemma [621 

„ h 



fc=i 
Therefore the three terms on the right-hand side of (20) are all $o_p(1)$. This
completes the proof.
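
The summation-by-parts step used in this proof rewrites a weighted sum through partial sums of one factor and increments of the other. The following Python sketch illustrates the identity and the resulting bound on generic sequences; the sequences `a` and `b` are hypothetical stand-ins and do not reproduce the periodogram quantities of the lemma.

```python
import numpy as np

def sum_by_parts(a, b):
    """Evaluate sum_k a[k]*b[k] through partial sums of a and increments of b."""
    S = np.cumsum(a)                                  # S_k = a_1 + ... + a_k
    return float(np.sum(S[:-1] * (b[:-1] - b[1:])) + S[-1] * b[-1])

rng = np.random.default_rng(0)
a = rng.normal(size=50)          # stand-in for centred, noisy terms
b = 1.0 / np.arange(1, 51)       # stand-in for a smooth, slowly varying weight

direct = float(np.dot(a, b))
by_parts = sum_by_parts(a, b)

# Bound implied by the identity:
# |sum a_k b_k| <= max_k |S_k| * (sum_k |b_k - b_{k+1}| + |b_n|)
bound = float(np.max(np.abs(np.cumsum(a))) * (np.sum(np.abs(np.diff(b))) + abs(b[-1])))
```

The bound shows why controlling the partial sums, together with a mean-value bound on the increments of the weights, is enough to control the whole sum.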



Lemma 6.2. Under Assumptions \2.S[ [273\ and \2.4\ 

Proof of Lemma 6.2. Under Assumption 2.3, $\int_{-\pi}^{\pi} \partial \log G(\lambda;\theta_0)/\partial\theta \, d\lambda = 0$. So

n—1 

= 'J^ + own)] + o(„-'/^) = o(„-v^ 

Thus it suffices to show that for any $b \in \mathbb{R}^s$, $b'b = 1$,

$$b'\{T_n - E(T_n)\} \to_D N(0, \sigma_b^2), \qquad (21)$$

where (7^ = 6T(^)(0o)5. A major difficulty in proving (j2ip is caused by the fact that 
the first element of $\partial \log G(\lambda;\theta_0)/\partial\theta$ possesses a pole at zero frequency in the long
memory case. We shall use a truncation argument to circumvent the problem.

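Write $b'T_n = n^{-1/2}\sum_{j=1}^{n-1} I_{uj}\,\psi(\lambda_j)$, where $\psi(\lambda) = 2\pi b'(\partial \log G(\lambda;\theta_0)/\partial\theta)$. For
any $c \in (0, 1/2)$, we define two $2\pi$-periodic functions $\psi_1(\lambda, c) = \psi(\lambda)1(|\lambda| > c) + \psi(c)1(|\lambda| \le c)$
and $\psi_2(\lambda, c) = \psi(\lambda) - \psi_1(\lambda, c)$, $\lambda \in [-\pi, \pi)$. Let $T_{1n}(c) = n^{-1/2}\sum_{j=1}^{n-1} I_{uj}\,\psi_1(\lambda_j, c)$
and $T_{2n}(c) = n^{-1/2}\sum_{j=1}^{n-1} I_{uj}\,\psi_2(\lambda_j, c)$. Then (21) follows
from the following two assertions:

$$\limsup_{c \downarrow 0}\ \limsup_{n \to \infty}\ \mathrm{var}(T_{2n}(c)) = 0 \qquad (22)$$

and

$$T_{1n}(c) - E\{T_{1n}(c)\} \to_D N(0, \sigma_b^2(c)) \quad \text{and} \quad \lim_{c \downarrow 0} \sigma_b^2(c) = \sigma_b^2. \qquad (23)$$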
By Lemma 6.8, we have



(72„(c)) <-Yl cov(4,-,4fc)(V'(A,) - V(c))(V'(Afc) - V(c)) 



C 

var _ 

n 
j,k=i 

^ [H ^ [cn] 



< C j /n'(A)^2(A)(iA + i,\c)c + E l^(^j) - 



cn] 

which tends to zero as $c \downarrow 0$. So (22) holds.

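We shall further approximate $T_{1n}(c)$ using techniques in Fourier analysis. Let
$d_k = (2\pi)^{-1}\int_{-\pi}^{\pi} \psi_1(\lambda, c) e^{ik\lambda}\,d\lambda$ be the Fourier coefficient of $\psi_1(\lambda, c)$. For a fixed
$h \in \mathbb{N}$, let $\psi_h(\lambda) = \sum_{|k| < h} (1 - |k|/h)\, d_k e^{-ik\lambda}$ be the Cesàro mean of the first $h$
Fourier approximations to $\psi_1(\lambda, c)$ and $\tilde\psi_h(\lambda) = \psi_1(\lambda, c) - \psi_h(\lambda)$ be the remainder.
Write $T_{1n}(c) = n^{-1/2}\sum_{j=1}^{n-1} I_{uj}\{\psi_h(\lambda_j) + \tilde\psi_h(\lambda_j)\}$. It is not hard to see that

$$\limsup_{n \to \infty}\ \mathrm{var}\Big(n^{-1/2}\sum_{j=1}^{n-1} I_{uj}\,\tilde\psi_h(\lambda_j)\Big) = \limsup_{n \to \infty}\ \frac{1}{n}\sum_{j,k=1}^{n-1} \mathrm{cov}(I_{uj}, I_{uk})\,\tilde\psi_h(\lambda_j)\tilde\psi_h(\lambda_k)$$

$$\le \limsup_{n \to \infty}\ \frac{C}{n}\Big\{\sum_{j=1}^{\tilde n} \tilde\psi_h^2(\lambda_j) + \frac{1}{n}\sum_{j,k=1}^{\tilde n} |f_4(\lambda_j, -\lambda_k, \lambda_k)\,\tilde\psi_h(\lambda_j)\tilde\psi_h(\lambda_k)|\Big\}$$

$$\le C \sup_{\lambda \in [0, 2\pi]} |\tilde\psi_h(\lambda)|^2 \to 0 \quad \text{as } h \to \infty$$

by Fejér's theorem.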

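Letting $b_h = (2\pi)^{-1}(d_0, 2d_1(1 - 1/h), \ldots, 2d_{h-1}/h)'$, $\hat\gamma_n(k) = n^{-1}\sum_{j=1}^{n-|k|} u_j u_{j+|k|}$
and $\hat\gamma_n(h) = (\hat\gamma_n(0), \hat\gamma_n(1), \ldots, \hat\gamma_n(h-1))'$, then

$$\frac{1}{\sqrt{n}}\sum_{j=1}^{n-1} [I_{uj} - E(I_{uj})]\,\psi_h(\lambda_j) = \frac{\sqrt{n}}{2\pi}\sum_{|k| < h} (1 - |k|/h)\, d_k\{\hat\gamma_n(k) - E\hat\gamma_n(k)\} = \sqrt{n}\, b_h'[\hat\gamma_n(h) - E\hat\gamma_n(h)]. \qquad (24)$$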

By Theorem 1 in Wu (2005), for fixed $h \in \mathbb{N}$,

$$\|\mathcal{P}_0 u_t u_{t+h}\| \le \|u_t u_{t+h} - u_t' u_{t+h}'\| \le C(\delta_4(t) + \delta_4(t + h)).$$

Then Assumption 2.3 implies that $\sum_{t=0}^{\infty} \|\mathcal{P}_0 u_t u_{t+h}\| < \infty$, which subsequently
leads to the joint asymptotic normality of $\sqrt{n}\{\hat\gamma_n(h) - E\hat\gamma_n(h)\}$ in view of Theorem
1(i) in Hannan (1973a) or Lemma 1 in Wu and Min (2005). Finally, it is easy to
see that $\sigma_b^2(c)$ approaches $\sigma_b^2$ as $c \downarrow 0$. Thus the conclusion follows.
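
The truncation and Cesàro-mean steps of this proof can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the paper's construction: it takes $\psi(\lambda) = -2\log|2\sin(\lambda/2)|$ (the derivative with respect to $d$ of $\log|1 - e^{i\lambda}|^{-2d}$) as a stand-in for a function with a pole at zero frequency, truncates it at $c = 0.5$, and computes Fejér (Cesàro) means from FFT-based approximate Fourier coefficients on an equispaced grid. The function names, the grid size and the values of $h$ are hypothetical choices.

```python
import numpy as np

def psi(lam):
    """Illustrative even function with a logarithmic pole at zero frequency."""
    return -2.0 * np.log(np.abs(2.0 * np.sin(lam / 2.0)))

def psi1(lam, c):
    """Truncated version: psi(lam) for |lam| > c and the constant psi(c) otherwise."""
    wrapped = (lam + np.pi) % (2.0 * np.pi) - np.pi       # map to [-pi, pi)
    return psi(np.maximum(np.abs(wrapped), c))            # valid because psi is even

def fejer_mean(values, h):
    """Cesaro mean of the first h partial Fourier sums, from equispaced samples."""
    n = values.size
    coef = np.fft.fft(values) / n                         # approximate Fourier coefficients
    k = np.fft.fftfreq(n, d=1.0 / n)                      # integer frequencies
    weights = np.clip(1.0 - np.abs(k) / h, 0.0, None)     # Fejer weights (1 - |k|/h)_+
    return np.real(np.fft.ifft(coef * weights) * n)

grid = 2.0 * np.pi * np.arange(4096) / 4096
target = psi1(grid, c=0.5)
# Sup-norm remainder of the Cesaro mean for increasing h
errors = [float(np.max(np.abs(fejer_mean(target, h) - target))) for h in (4, 16, 64, 256)]
```

Because the truncated function is continuous and periodic, Fejér's theorem guarantees that the sup-norm remainder vanishes as $h$ grows, which is the property the proof relies on; the untruncated $\psi$ would not allow this because of its pole.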



Lemma 6.3. Under the assumptions in Theorem 3.2, the random variables $L_{3n}(\theta)$
and $L_{1n}(\theta)$ defined in (15) satisfy

$$\sup_{\theta \in \Theta_1} |L_{3n}(\theta)/L_{1n}(\theta)| = o_p(1).$$
Proof of LemmaE^l Note that Tj{d) = e'^^ Y.T=mo^m+li'^ " e*^0~''^r. We have 

mVmo 

|rj(d)|2 < C ^ |1 -e^^^r^^Z^. 

r=mo Am+l 



Since < Ci < K{\; 9) < C2 < oo for any A and 9, we get 

Lin{9) > |l-e^^^f'^-2'^° 

and 



n . 



n 

r=moAm+l j=l 



When m > tuq, = Op{n ^) for r = niQ + 1, • • • , m. So, uniformly in G Gi, 

T (a\ ^n~l^m II _ piAj|2{d-r) 

< CO„(n-^) ^^'=^^7"'°+^'' " ' = 0„(n-i) = oJl). 

Lin(^) " ^«-i|i_e^A,|2(d-do) ^ Pi ; 

When m < tuq and 6* G 61, m = mo — 1, dxo < 0, dx > and 
Op(n2('^0"™o)). Therefore, 



mo 



T {0\ l-l — pi^j\'2id-mQ)Q ( 2{do-mo)\ 



Li„(0) - ^«-i|i_e^A,|2(d-do) 
uniformly in $\theta \in \Theta_1$. The conclusion follows.

Denote by $d_\Delta = d_0 - 1/2 + \Delta$ for some $\Delta \in (0, 1/4)$.

Lemma 6.4. Suppose that the assumptions of Theorem 3.2 are satisfied. Then as
$n \to \infty$, uniformly in $d \in [a_1, d_\Delta]$ (if it is nonempty),

$$L_n(\theta) \ge C(1 + o_p(1))\,J_n(d_\Delta), \qquad J_n(d_\Delta) = \frac{1}{n}\sum_{j=1}^{\tilde n} (j/n)^{2d_\Delta} I_{Xj}\,|1 - e^{i\lambda_j}|^{-2m_0},$$

where $C$ is a positive constant that does not depend on $\Delta$.
Proof of Lemma 6.4. The proof follows the argument in the proof of Lemma 4.2
of Abadir et al. (2007). For the sake of completeness, we present the details here.
Note that when $d \in [a_1, d_\Delta]$,

„ h 

Ln{e) > - V(j7n)2'^^/y(A,-; > C{Jn{dA) " 2|C„(d)| + \Bn{d)\) 
i=i 

for some C > 0, where 



Bn{d) 



1 " 

-Y.{j/nf^^wy{X,;do)T,{d) 
1 " 



We shall show that $C_n(d) = o_p(1)(J_n(d_\Delta) + B_n(d))$. Then the conclusion follows
since $B_n(d) \ge 0$. To this end, let $\delta \in (0, 1/4)$ be a small fixed number. Write
Cn{d) < (27r)-^»(L>„,i + Dn,2). where 



D. 



n,l = n 



L5nJ 

Y.^j/n?''^-'''v,T,{d) 



Dn_2 = n~ 



1 ^n, 



j=l+lSn\ 



do 



with Vj = wx{Xj){l - e*^0"™° * A 

We shall first show that $D_{n,1} \le C_{1\delta}(1 + o_p(1))(B_n(d))^{1/2}$, where the constant
$C_{1\delta} > 0$ does not depend on $d$ and $n$, and $C_{1\delta} \to 0$ as $\delta \to 0$. By the Cauchy-Schwarz
inequality,



lSn\ 



2dA-2do\„.^2 



1/2 / - \ 1/2 

n 

n-'Y{j/nr-\T,{d)\' 
j=i 



where the square of the first term is 



[Sn] 



n 



-1+2A 



c 



[Sn] 



C / X 



-1+2A 



dx = Cu. (25) 







Note that the constant $C$ in the preceding display does not depend on $\delta$ and the
convergence in (25) holds since by Lemmas 6.5, 6.6 and 6.8,

I5n\ 

-^{j/n)-'^'^iE\\gj\'-\hj\'\) = 0(1), 



I5n\ 

i=i 

/ lSn\ \ 

var(i^0yn)"^+2^|/^/j 



-1+2A 



dx and 



0(1). 



Next, we show that Dn,2 = Op(l)(S„((i))i/^ Let B{d) = E^m+i • Denote 
by Sj = ELL<5nJ+i '"I *i = U/n)^'^^~'^°Tj{d), j = [6n\ + 1, • • • ,n. Summation 
by parts imphes that 

n 

Dn,2<Cn-^ ^ \Sj\\tj -tj+l\ + \Sntn\■ 
j=[Sn^ 

Following the same argument as in Lemma 4.2 of Abadir et al. (2007), the mean 
value theorem implies that 

\tj-tj+i\ < C25r\B{d)f'\ and \tj\ < C25{B{d)f'^ 



uniformly in d G [ai, ^a] and j = \5n\ + 1, 
So we have 



, n. 



n-l 



Dn,2 < C25{B{d)Y'^Vn, Vn = n'M ^ \Sj\r^ + \Sn 

\j=V&n\ 

By Lemma l6.5| when 1 < A; < j < n, 



|E(fiW)| 



IE(ffifffc)| <Clogj7fe 



and E(vjU7) < CE{gjgj) < C(l + logjVi). So E{S]) = 0{jlog^j), which implies 
lE(V^) = o(l) and Vn = Op(l). Finally, in view of Lemma l6.9l there exists an > 0, 
such that 

2 



C 



Bn{d) > -JZU/n 



,2mo+l 



mo 



r'=m+l 



mo 



r=m+l 



So uniformly in d G [ai,dA], Cn{d) = Op(l)(S„(d))V2 ^ Op(l)(l + Bn{d)). Since 



Jn{dA] 



n 



I— 2mo 



> (by Lemma EU, C„(d) 



Op(l)( J„((iA) + Bn{d)). Therefore the conclusion follows. 
6.3 Auxiliary Lemmas

We introduce the following working assumption, which holds for an uncorrelated
process $\{u_t\}$.

Assumption 6.1. Assume that $\sum_{k \in \mathbb{Z}} |\gamma_u(k)| < \infty$, where $\gamma_u(k) = \mathrm{cov}(u_t, u_{t+k})$.

The following three lemmas are extensions of Lemmas 6.2-6.4 in Shao and Wu
(2007b), where the results hold uniformly in $j = 1, \ldots, m$, $m = o(n)$.

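Lemma 6.5. Under Assumptions 2.1 and 6.1, the following expressions hold uniformly
in $1 \le k < j \le \tilde n$:

$$|E\{\bar g_j g_j\} - 1| + |E\{\bar h_j h_j\} - 1| + |E\{\bar g_j h_j\} - A_j/|A_j|| = O(\log j/j); \qquad (26)$$

$$E\{g_j g_j\} = O(\log j/j), \quad E\{g_j g_k\} = O(\log j/k), \quad E\{\bar g_j g_k\} = O(\log j/k);$$
$$E\{h_j h_j\} = O(\log j/j), \quad E\{h_j h_k\} = O(\log j/k), \quad E\{\bar h_j h_k\} = O(\log j/k);$$
$$E\{g_j h_j\} = O(\log j/j), \quad E\{g_j h_k\} = O(\log j/k), \quad E\{\bar g_j h_k\} = O(\log j/k).$$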

Proof of Lemma 6.5. The proof largely follows the argument in Theorem 2 of
Robinson (1995a), where the uniformity is proved for $j = 1, \ldots, m = o(n)$. A
detailed check of its proof shows that the argument still goes through.



Lemma 6.6. Suppose Assumptions WJ\ and [6n\ hold. Then 

E\ixj - iuj\ = 0(j~-^/^) uniformly in j = 1, • • • ,n. (27) 

Proof of Lemma 16.61 It follows from the argument of Lemma 6.3 in Shao and Wu 
(2007b) and the fact that 

2 

dX = 0(1/ j) uniformly in j = 1, • • • , n. (28) 



J —IT 



A{Xj] 



The proof of (28) basically repeats the argument in Robinson's (1995b) Lemma 3
and is omitted.



Lemma 6.7. Suppose A s sumptions [KTl 2.^ and \6.1\ hold. Then 

< C{r^/\l + logr)^/2 ^ ^1/2^-1/4)^ ^ < 

where C is a generic constant independent of r and n. 



E 



^(/Xi - luj) 
Proof of Lemma 6.7. The proof repeats the argument in Lemma 6.4 of Shao and
Wu (2007b), where the two key results needed are the absolute summability of
4-th cumulants [i.e. Assumption 2.2] and (28). We omit the details.

Lemma 6.8. Under Assumptions 2.2 and 6.1, we have

$$\mathrm{cov}(I_{uj}, I_{uk}) = 1(j = k)[f_{uj}^2 + o(1)] + 1(j \ne k)[2\pi n^{-1} f_4(\lambda_j, -\lambda_k, \lambda_k) + o(1/n)]$$

uniformly in $j, k = 1, 2, \ldots, n$.
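Proof of Lemma 6.8. Note that

$$\mathrm{cov}(I_{uj}, I_{uk}) = E(w_{uj}\overline{w_{uk}})\,E(\overline{w_{uj}}w_{uk}) + E(w_{uj}w_{uk})\,E(\overline{w_{uj}}\,\overline{w_{uk}}) + \mathrm{cum}(w_{uj}, \overline{w_{uj}}, w_{uk}, \overline{w_{uk}}).$$

Under Assumption 6.1, we have

$$E(w_{uj}w_{uk}) = \frac{1}{2\pi n}\sum_{t,s=1}^{n} \gamma_u(t - s)\,e^{i(t\lambda_j + s\lambda_k)} = O(1/n),$$

$$E(w_{uj}\overline{w_{uk}}) = \frac{1}{2\pi n}\sum_{t,s=1}^{n} \gamma_u(t - s)\,e^{i(t\lambda_j - s\lambda_k)} = 1(j = k)[f_u(\lambda_j) + o(1)] + O(1/n).$$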

Further, Assumption 2.2 implies that

1 " 

cum.{wuj , '^uk , w^l) = ^^2^2 cum(nti , ut^ , up^ , ) 

ilit2,t3i*4 = l 

^i[tiXj-t2Xj+t3\^.~t4\k] 
^ n—1 

= E cumK,u,,,u,„u,3)e*[-'^i"^+'^^"'=-'^3^^] 

hi,h2,h3=l-~n 

[n - 1 + A /ii A /i2 A /i3 - V /ii V /i2 V /is] = —U{\j, -Xk, Afc) + o(l/n), 

n 

where we have applied the Lebesgue dominated convergence theorem above. The
conclusion follows by noting that all the results above hold uniformly in $j, k = 1, 2, \ldots, n$.

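
For iid errors the covariance structure of periodogram ordinates can be checked by simulation. The Python sketch below is a Monte Carlo illustration under assumptions that are much stronger than those of the lemma: the errors are iid centred exponential variables (variance 1, fourth cumulant 6), so that $f_u = 1/(2\pi)$ and, for distinct Fourier frequencies with $j + k \ne n$, the covariance equals $\kappa_4/(4\pi^2 n)$, which is $2\pi n^{-1}$ times the constant fourth-order cumulant spectrum. The sample size, the number of replications and the chosen indices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(12345)
n, reps = 32, 20000

# iid errors with mean 0, variance 1 and fourth cumulant 6 (centred exponential)
u = rng.exponential(1.0, size=(reps, n)) - 1.0
w = np.fft.fft(u, axis=1) / np.sqrt(2.0 * np.pi * n)   # discrete Fourier transforms
I_u = np.abs(w) ** 2                                   # periodogram ordinates

j, k = 3, 7
f_u = 1.0 / (2.0 * np.pi)                              # spectral density of the errors
kappa4 = 6.0
offdiag_theory = kappa4 / (4.0 * np.pi ** 2 * n)       # (2*pi/n) * f_4 for iid errors
diag_theory = f_u ** 2 + offdiag_theory                # variance of a single ordinate

emp_cov = float(np.cov(I_u[:, j], I_u[:, k])[0, 1])
emp_var = float(np.var(I_u[:, j], ddof=1))
```

The off-diagonal covariance is of order $1/n$ and is driven entirely by the fourth cumulant, while the variance is dominated by $f_u^2$; dependent errors such as GARCH innovations would require the general cumulant spectrum instead of the constant used here.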
The following lemma is analogous to Lemma 4.3 in Abadir et al. (2007) and 
our argument seems simpler. 

Lemma 6.9. Let $q \ge 0$ be a fixed integer. Then there exists $\eta > 0$ such that, as
$n \to \infty$, uniformly in $a_0, \ldots, a_q \in \mathbb{R}$,



^(l_e*A,)-.^^ 



r=0 



>r7E«'- (29) 

r=0 
Proof of Lemma 6.9. Write the left-hand side of (29) as $\sum_{r,s=0}^{q} a_r a_s D_n(r, s)$, where

n 

i=i 

Note that $D_n(r, s) \to D(r, s)$ for $r, s = 0, \ldots, q$, where

1 fl/2 2q+l 

It is easy to see that $D = (D(r, s))_{r,s=0,\ldots,q}$ is a positive semidefinite Hermitian
matrix, so all the eigenvalues are real and nonnegative. We proceed to show that
no eigenvalues are zero. Suppose that there exists a vector $(G_0, G_1, \ldots, G_q)'$ such
that $\sum_{s=0}^{q} D(r, s) G_s = 0$ for $r = 0, \ldots, q$. Let $G(x) = \sum_{s=0}^{q} (1 - e^{-i2\pi x})^{q-s} G_s$.
Then

rl/2 2q+l 

I jr^-^i^--''n'-'G{x)dx = 0, r = 0,-..,q. 

Thus we get 

"1/2 2g+l fl/2 ^2q+l 



I |i_,.2..|2, |g(")l'^" = I jr^^^G{x)Gix)dx 

1 rl/2 2q+l 

= Y.Grl -3-=G(x)(l-e^2-r^dx = 0, 



which implies that $G(x) = 0$ for almost all $x \in [0, 1/2)$. Consequently, $G_s = 0$,
$s = 0, 1, \ldots, q$. Let $2\eta > 0$ be the smallest eigenvalue of $D$. Then for large
enough $n$, the eigenvalues of $D_n = (D_n(r, s))_{r,s=0,\ldots,q}$ are no smaller than $\eta$. This
completes the proof.
Table 1: The bias and mean squared error (MSE) of the estimates of d for the FARIMA(2, d, 0)
model with GARCH(1,1) innovations when (a) n = 200 and (b) n = 512.

(a) n = 200

              d         W-1         W-3         W-4          EW
Bias       -0.4      0.0061     -0.0920     -0.2100     -0.0280
            0.4     -0.0115     -0.0829     -0.2054     -0.0453
            0.6     -0.0429     -0.0759     -0.1995     -0.0208
            0.9     -0.0099     -0.0595     -0.1914     -0.0461
            1.1 [illegible] [illegible] [illegible] [illegible]
            1.4     -0.4393      0.0010     -0.1707     -0.0453
            1.5     -0.5469      0.0724     -0.1658     -0.0042
            2.0     -0.9970      0.1549     -0.1369     -0.0482
            2.4     -1.4237      0.3539     -0.1055     -0.0453
MSE × 100  -0.4      0.9910      3.2282     12.0725      1.0767
            0.4      0.9853      3.0389      11.854      1.2049
            0.6      1.0211      2.8958      11.359      0.8644
            0.9      1.4808      2.5940      11.002      1.1459
            1.1      5.7039      2.3579      10.659      1.1765
            1.4      24.397      2.1186      10.160      1.2049
            1.5      32.610      2.9531      10.040      0.9486
            2.0      103.19      5.1102      9.7201      1.1650
            2.4      204.77     16.6550      9.4931      1.2049

(b) n = 512

              d         W-1         W-3         W-4          EW
Bias       -0.4      0.0094     -0.0346     -0.0689     -0.0041
            0.4     -0.0007     -0.0316     -0.0673     -0.0139
            0.6      0.0026     -0.0288     -0.0657     -0.0034
            0.9     -0.0385     -0.0220     -0.0622     -0.0158
            1.1     -0.1424     -0.0145     -0.0591     -0.0174
            1.4     -0.3695      0.0076     -0.0532     -0.0139
            1.5     -0.4894      0.0491     -0.0509      0.0104
            2.0     -0.9775      0.0948     -0.0361     -0.0170
            2.4     -1.3934      0.2474     -0.0199     -0.0139
MSE × 100  -0.4      0.3659      0.8356      2.1561      0.3599
            0.4      0.3208      0.8156      2.1719      0.3517
            0.6      0.3226      0.7961      2.1558      0.3430
            0.9      0.4886      0.7547      2.1161      0.3545
            1.1      2.5569      0.7203      2.0792      0.3552
            1.4      14.456      0.6973      2.0101      0.3517
            1.5      24.656      1.1706      1.9841      0.3669
            2.0      96.696      1.9106      1.8475      0.3560
            2.4      195.12      8.4293      1.7640      0.3517

Table 2: The bias and mean squared error (MSE) of the estimates of φ_1 for the FARIMA(2, d, 0)
model with GARCH(1,1) innovations when (a) n = 200 and (b) n = 512.

(a) n = 200

              d         W-1         W-3         W-4          EW
Bias       -0.4     -0.0072      0.0400      0.0925      0.0106
            0.4     -0.0066      0.0357      0.0903      0.0118
            0.6     -0.0142      0.0326      0.0873      0.0051
            0.9     -0.0745      0.0254      0.0845      0.0162
            1.1 [illegible] [illegible] [illegible] [illegible]
            1.4     -0.4684     -0.0006      0.0764      0.0117
            1.5     -0.5458     -0.0305      0.0746      0.0062
            2.0     -0.6195     -0.0620      0.0643      0.0163
            2.4     -0.6155     -0.1172      0.0526      0.0118
MSE × 100  -0.4      0.7545      1.7357      4.6639      0.7599
            0.4      0.7668      1.7158      4.6111      0.7673
            0.6      0.8337      1.6925      4.4636      0.6759
            0.9      2.1480      1.6399      4.4214      0.7584
            1.1      8.8843      1.5993      4.3566      0.7604
            1.4      27.606      1.5721      4.2666      0.7670
            1.5      32.433      1.7772      4.2542      0.5652
            2.0      40.138      2.0674      4.2813      0.7592
            2.4      38.784      2.9361      4.2722      0.7672

(b) n = 512

              d         W-1         W-3         W-4          EW
Bias       -0.4     -0.0067      0.0140      0.0327      0.0003
            0.4     -0.0057      0.0126      0.0320      0.0014
            0.6     -0.0115      0.0113      0.0313     -0.0002
            0.9     -0.0779      0.0082      0.0298      0.0044
            1.1     -0.2859      0.0047      0.0285      0.0043
            1.4     -0.5778     -0.0056      0.0259      0.0015
            1.5     -0.6147     -0.0245      0.0249     -0.0034
            2.0     -0.6599     -0.0443      0.0185      0.0046
            2.4     -0.6501     -0.0951      0.0115      0.0015
MSE × 100  -0.4      0.2856      0.5727      1.1368      0.2810
            0.4      0.2702      0.5707      1.1508      0.2672
            0.6      0.2901      0.5680      1.1502      0.2733
            0.9      1.5366      0.5618      1.1458      0.2726
            1.1      11.753      0.5577      1.1406      0.2691
            1.4      35.090      0.5700      1.1291      0.2672
            1.5      38.582      0.7093      1.1245      0.2275
            2.0      43.940      0.8519      1.0990      0.2709
            2.4      42.661      1.6801      1.0830      0.2672

Table 3: The bias and mean squared error (MSE) of the estimates of φ_2 for the FARIMA(2, d, 0)
model with GARCH(1,1) innovations when (a) n = 200 and (b) n = 512.

(a) n = 200

              d         W-1         W-3         W-4          EW
Bias       -0.4      0.0078     -0.0013     -0.0112      0.0077
            0.4      0.0112     -0.0007     -0.0109      0.0105
            0.6      0.0217     -0.0001     -0.0100      0.0077
            0.9      0.1256      0.0013     -0.0091      0.0067
            1.1      0.3226      0.0027     -0.0082      0.0073
            1.4      0.5511      0.0067     -0.0067      0.0105
            1.5      0.6015      0.0133     -0.0061      0.0063
            2.0      0.6081      0.0236     -0.0031      0.0068
            2.4      0.6158      0.0629      0.0005      0.0105
MSE × 100  -0.4      0.4422      0.8304      1.1368      0.4439
            0.4      0.4715      0.8302      1.2365      0.4750
            0.6      0.5650      0.8330      1.2421      0.4392
            0.9      3.4191      0.8408      1.2340      0.4445
            1.1      14.553      0.8495      1.2369      0.4497
            1.4      32.547      0.8735      1.2481      0.4744
            1.5      36.947      0.9145      1.2523      0.3709
            2.0      37.185      1.0062      1.2853      0.4467
            2.4      38.044      1.4756      1.3220      0.4749

(b) n = 512

              d         W-1         W-3         W-4          EW
Bias       -0.4      0.0040      0.0029      0.0019      0.0041
            0.4      0.0061      0.0030      0.0020      0.0061
            0.6      0.0128      0.0032      0.0021      0.0042
            0.9      0.1161      0.0035      0.0024      0.0037
            1.1      0.3559      0.0039      0.0026      0.0040
            1.4      0.5636      0.0051      0.0031      0.0061
            1.5      0.5905      0.0076      0.0033      0.0044
            2.0      0.5969      0.0113      0.0045      0.0038
            2.4      0.6009      0.0354      0.0059      0.0061
MSE × 100  -0.4      0.1729      0.3392      0.5846      0.1734
            0.4      0.1793      0.3393      0.5847      0.1800
            0.6      0.2090      0.3394      0.5852      0.1730
            0.9      2.6947      0.3399      0.5863      0.1730
            1.1      16.354      0.3406      0.5872      0.1742
            1.4      32.879      0.3438      0.5891      0.1801
            1.5      35.414      0.3519      0.5899      0.1455
            2.0      35.700      0.3631      0.5949      0.1735
            2.4      36.174      0.5482      0.6009      0.1801
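
The two summary statistics reported in Tables 1-3 can be computed from replicated estimates as in the short Python sketch below. This is a generic illustration and not the simulation code behind the tables: the function name and the four replicated values are hypothetical.

```python
import numpy as np

def bias_and_mse_x100(estimates, true_value):
    """Return (bias, 100 * MSE) of replicated estimates of a scalar parameter."""
    err = np.asarray(estimates, dtype=float) - true_value
    return float(err.mean()), float(100.0 * np.mean(err ** 2))

# Hypothetical replicated estimates of d when the true value is 0.4
bias, mse100 = bias_and_mse_x100([0.38, 0.42, 0.41, 0.39], 0.4)
```

Scaling the MSE by 100, as the tables do, keeps small values readable next to the bias column.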